Empower your research!

Expand your knowledge at the 2024 Compute Ontario Summer School

Dates: June 3 to 21, 2024
Time: Monday to Friday from 9am to 4:30pm Eastern Time (with 1.5h lunch break)
Where: Virtual/Online
Cost: Free

COSS2024 Courses Are Still Available...

Although Compute Ontario Summer School 2024 has ended, you can still access the course materials, recordings, etc. online by registering for the course.

Event Description

The Compute Ontario Summer School offers a comprehensive curriculum packed with 40 courses. Delivered by experts in the field, these sessions cover a wide range of topics including Advanced Research Computing (ARC), High Performance Computing (HPC), Research Data Management (RDM), and Research Software (RS). With presentations and workshops available at introductory to intermediate levels, there is something for everyone. Link to announcement/introduction video.

Registration

To register for a course:

Click the Register link associated with that course.
Click that course's Enrol button.
If/When prompted, sign in with your Alliance (CCDB) or Compute Ontario Training account.

If you do not already have an Alliance (CCDB) or Compute Ontario Training account, an account can be created using links on our login page.

NOTE: Most courses do NOT require you to have an Alliance account. Courses that do require an Alliance account are noted in those course descriptions.

We also have a frequently asked questions (FAQ) page for this event.

Stream	When (EDT)	Course
Week 1
S1	:: Mon. June 3 :: 09:00 to 12:00 13:30 to 16:30	Introduction to Linux shell :: Register :: Link to Course :: n/a :: Running programs on the supercomputers is done via the BASH shell. This course consists of two three-hour live demos on using bash. No prior familiarity with bash is assumed. In addition to the basics of getting around, globbing, regular expressions, redirection, pipes, and scripting will be covered. A series of exercises must be done to complete the course. Level: Introductory Length: Two 3-Hour Sessions Format: Lecture + Hands-on Prerequisites: None
S2	:: Mon. June 3 :: 09:00 to 12:00	Data Preparation :: Register :: Link to Course :: n/a :: This course provides you with essential knowledge and skills to effectively prepare data for analysis. Starting with an overview of the Data Analytics pipeline and processes, the course explores various statistical and visualization techniques used in Exploratory and Descriptive Analytics to understand historical data. You will then delve into the art of Data Preparation, gaining expertise in data cleaning, handling missing values, detecting and handling outliers, as well as transforming and engineering features. By the end of the course, you will be equipped with the necessary tools to ensure data quality and integrity, enabling you to make informed decisions and derive valuable insights from your data. Level: Introductory Length: 3 Hours Format: Lecture + Hands-on Prerequisites: Basic Python
S2	:: Mon. June 3 :: 13:30 to 16:30	Data Security :: Register :: Link to Course :: n/a :: Be aware. Stay secure. Join us to learn more about the tools you can use to prevent the theft of your data and possibly of your identity. Other topics of discussion will include common hacking attempts, how to recognize them, and how to avoid having your data compromised, stolen, or destroyed. We will also talk about data encryption and provide tips for when travelling with electronic devices. Level: Introductory Length: 3 hours Format: Lecture Prerequisites: None
S1	:: Tues. June 4 :: 09:00 to 12:00	Introduction to Advanced Research Computing :: Register :: Link to Course :: n/a :: This workshop is a primer for those largely new to supercomputing, i.e., to computing on shared, remote resources. It is intended to demystify the somewhat intimidating term "High-Performance Computing" (HPC), and to serve as a foundation upon which to build over the coming days. Topics will include motivation for HPC, available resources, essential issues, and a high level overview of parallel programming models commonly used on these systems. Level: Introductory Length: 3 Hours Format: Lecture + Hands-on Prerequisites: Basic Linux e.g. "Introduction to Linux Shell"
S2	:: Tues. June 4 :: 09:00 to 12:00	Bioinformatics: Analysis of RNA-Sequencing Data :: Register :: Link to Course :: n/a :: RNA-Seq refers to high throughput sequencing methods that probe the entire transcriptomic landscape of a given tissue or sample of interest. The data acquired from such experiments can be used to explore the overall RNA profile of a sample as well as to compare samples under various conditions. While extremely powerful, RNA-Seq is susceptible to numerous experimental pitfalls and requires intimate knowledge of the experimental procedures and data analysis methods. When conducted properly, RNA-Seq can reveal information about gene/transcript expression, splicing and the effects of mutations. In this session we will take a thorough look at a comprehensive RNA-Seq pipeline, from sample processing methods to final differential expression analysis. Relevant R / BioConductor packages will be introduced. We will have the opportunity to investigate numerous quality control metrics, perform genomic alignment, differential expression and pathway enrichment analysis. We will cover several “gotcha”s and common mistakes in experimental design and data analysis. Basic familiarity with R and Linux command line will be beneficial but not required. All necessary commands and parameters will be explained during the class. Participants will be offered hands-on practice in which they will use RStudio to run R/BioConductor scripts for data analysis as well as the Integrative Genomic Viewer (IGV) software to visualize genomic data on their laptops. Level: Intermediate Length: 3 Hours Format: Lecture + Hands-on Prerequisites: Basic R and Linux beneficial but not required
S1	:: Tues. June 4 :: 13:30 to 16:30	Introduction to Version control (Git) :: Register :: Link to Course :: n/a :: Using version control for your scripts, codes, documents, papers, and even data allows you to track changes, keep backups, and facilitate collaboration. This introductory workshop will teach you the basics of version control with the popular distributed version control software Git. This workshop assumes that students have an understanding of basic Linux shell commands. Level: Introductory Length: 3 Hours Format: Lecture + Hands-on Prerequisites: Basic Linux
S2	:: Tues. June 4 :: 13:30 to 16:30	Bioinformatics: Long-read Sequencing Applications :: Register :: Link to Course :: n/a :: Long-read sequencing technologies enable the sequencing of DNA fragments 10KB and longer. This read length greatly improves sequence mappability and assembly, providing an advantage over short-read sequences that are difficult to map uniquely to repetitive and GC-rich regions. Long-read sequencing has applications in a number of fields including genome assembly, diagnosis of genetic diseases, and metagenomics. In this workshop, we will focus on PacBio HiFi sequences and introduce you to tools for haplotyping, calling and visualizing structural variants and repeat expansions, visualizing read methylation, and detecting novel isoforms from PacBio Iso-Seq data. Participants will be offered hands-on practice in which they will use RStudio to run R/BioConductor scripts for data analysis as well as the Integrative Genomic Viewer (IGV) software to visualize genomic data on their laptops. Level: Intermediate Length: 3 Hours Format: Lecture + Hands-on Prerequisites: Basic R
S1	:: Wed. June 5 :: 09:00 to 12:00 13:30 to 16:30	Introduction to Python :: Register :: Link to Course :: n/a :: This course is designed to provide you with a solid foundation in the Python programming language. Through a comprehensive curriculum and hands-on coding exercises, participants will learn the fundamentals of Python syntax, data types, functions, and file handling. By the end of the course, you will have gained the essential skills to write Python programs, solve problems, and build the foundation for more advanced Python development. Whether you are a beginner or have some programming experience, this course will equip you with the necessary tools to start your journey in Python programming. Level: Introductory Length: Two 3-Hour Sessions Format: Lecture + Hands-on Prerequisites: None
S2	:: Wed. June 5 :: 09:00 to 10:20	Research Data Management: Rationale for Reproducibility :: Register :: Link to Course :: n/a :: The role of good research data management practices in supporting research reproducibility is becoming increasingly well known. The literature is replete, however, with examples of poor methodology, lack of transparency, mistakes, and misconduct leading to bad science and an inability to reproduce results. This introductory session will provide real-world, illustrative examples of each of these, along with practical suggestions on how to avoid them. Level: Introductory Length: 1.5 Hours Format: Lecture Prerequisites: None
S2	:: Wed. June 5 :: 10:30 to 11:50	From the I-Ching to ChatGPT: A Brief History of AI and Some Historical and Current Applications :: Register :: Link to Course :: n/a :: Google's 2017 research paper "Attention Is All You Need" described the transformer, a new machine learning technique. From that paper the modern Large Language Model was born, and we're now living in the thick of a new era brought on by companies like OpenAI, Mistral and Anthropic. But where does this cutting-edge technology come from? What are its roots? What are its problems? This talk explores the history of procedural generation in text and games, from the I-Ching to transformer-based language models and beyond. The talk will emphasize the current state of the art in text-based language models, and include demonstrations on how to run language models locally on your own hardware. Level: Introductory Length: 1.5 Hours Format: Lecture Prerequisites: None
S2	:: Wed. June 5 :: 13:30 to 14:50	Using Generative AI Tools for Research Data Management :: Register :: Link to Course :: n/a :: In this workshop, we will explore the potential uses of generative artificial intelligence tools in research data management (RDM) with a focus on specific use cases. For example, can AI tools be used to write Data Management Plans, summarize funder requirements, assist with data analysis, or suggest file naming conventions and folder structures? This workshop will be interactive, and participants will be welcome to practice using AI tools along with the presenters using real-world data and prompts. We will also discuss the ethical considerations, including benefits and risks, of using AI tools in research and whether it is possible to use AI for RDM practices in an ethical manner. Level: Introductory Length: 1.5 Hours Format: Lecture + Hands-on Prerequisites: None
S2	:: Wed. June 5 :: 15:00 to 16:20	Introduction to Alliance RDM Services :: Register :: Link to Course :: n/a :: This session provides an overview of the Research Data Management (RDM) Services offered by the Digital Research Alliance of Canada, including the DMP Assistant, a national, bilingual platform for the creation and management of data management plans (DMPs), the Federated Research Data Repository (FRDR), a bilingual publishing platform for sharing and preserving Canadian research data, and Lunaris, Canada’s national discovery service for multidisciplinary data from over 90 academic, government, and research repositories across the country. This session will introduce participants to these platforms and provide an overview of how they support the research lifecycle. Attendees will gain valuable insights into the benefits of these tools and how they can help researchers to streamline their data management workflows. Presenter Biographies: Neha Milan serves as the Product Lead for the Federated Research Data Repository (FRDR), a pivotal role that sees her overseeing the ongoing design and development of the FRDR platform. Based at the University of Saskatchewan, Neha is at the forefront of the FRDR Sensitive Data Pilot Project, steering its direction and implementation. Laura Gerlitz is a Curation Officer for the Federated Research Data Repository (FRDR), based out of Edmonton, Alberta. With a background in library and information studies and digital humanities, Laura specializes in metadata for the FRDR platform. Shlomi Linoy is a Research Data Analyst and Data Discovery Metadata Specialist at McMaster University, specializing in data discovery metadata for the Lunaris platform. Marcus Closen works on the DMP service team for the Digital Research Alliance of Canada. He is in the late stages of completing a PhD in political science at the University of Toronto (working with mixed-methods, as well as side projects in machine learning applications) and holds a master's degree from the University of Manitoba. Level: Introductory Length: 1.5 Hours Format: Lecture Prerequisites: None
S1	:: Thurs. June 6 :: 09:00 to 12:00	AI Showcase :: Register :: Link to Course :: n/a :: This course introduces Artificial Intelligence (AI), a science focusing on developing intelligent systems capable of autonomous behavior. In this course, we explore the exciting world of AI, introducing its definition and history. We discuss the advantages and challenges of AI in the present time, along with various applications and projects that demonstrate its capabilities. Throughout the session, participants will gain insights into different types of AI, learn about running predefined projects, and discover AI showcases on various platforms. By the end of the course, participants will have the knowledge and resources to start their own AI projects with their data, exploring the latest AI advancements in our clusters. Level: Introductory Length: 3 Hours Format: Lecture Prerequisites: Basic Python beneficial but not required
S2	:: Thurs. June 6 :: 09:00 to 12:00	High-Performance I/O and Storage :: Register :: Link to Course :: n/a :: This workshop will help you understand the relation between storage systems and application-level performance. We will survey the design of storage found on national systems, and consider their performance implications. A range of different IO techniques, data formats, and libraries will be considered. Ideally, participants should have an account on the National Platform (DRI). Examples and exercises will be in Python. Level: Intermediate Length: 3 Hours Format: Lecture + Hands-on Prerequisites: Alliance Account, Python Experience
S1	:: Thurs. June 6 :: 13:30 to 16:30	R :: Register :: Link to Course :: n/a :: This half-day session offers a brief introduction to R, with a focus on data analysis and statistics. We will discuss the following topics: the R interface, primitive data types, lists, vectors, matrices, and data frames - a crucial data type in data analysis and the trademark of the R language. Advanced topics to be covered include basic statistics, function creation, and the basics of scripting. Level: Introductory Length: 3 Hours Format: Lecture + Hands-on Prerequisites: Some programming experience in another programming language
S2	:: Thurs. June 6 :: 13:30 to 14:50	Supporting Research with Data Management Plans and the DMP Assistant! :: Register :: Link to Course :: n/a :: This session will provide participants with information, guidance, and resources for supporting research through the development and implementation of data management plans (DMPs). General topics covered will include the importance and benefits of DMPs, their content, and impending DMP requirements relating to the Tri-Agency research data management (RDM) policy. Specific focus will be given to the Digital Research Alliance of Canada DMP Assistant platform that is hosted nationally at the University of Alberta Library, along with a new DMP template developed by the Alliance’s DMP Expert Group (DMPEG). This new template is targeted specifically to support researchers in meeting DMP requirements at the funding opportunity application stage. Additional information relating to an accompanying assessment rubric that is currently in development will be shared. Time will be reserved for questions and discussion. Biography: James Doiron is the Research Data Management Strategies Director, University of Alberta Library, and Academic Director of the UofA Research Data Centre. Locally, James serves on UofA’s Institutional Research Data Management Strategy Working Group (chair), Indigenous Research Strategy Task Force, and Health Research Ethics Board. Nationally, he serves as a member of the Canadian Research Data Centre Network Board of Directors and is co-chair of the Digital Research Alliance of Canada’s Data Management Planning (DMP) Expert Group. Level: Introductory Length: 1.5 Hours Format: Lecture Prerequisites: None
S2	:: Thurs. June 6 :: 15:00 to 16:20	Empowering Open Science: An Introduction to Depositing and Sharing Research Data and Code in Borealis :: Register :: Link to Course :: n/a :: The reproducibility of research is essential to the scientific community, as it ensures the accuracy and reliability of research findings that are used to build upon existing knowledge. However, reproducibility is often hindered by the lack of access to research data, documentation, and code. This workshop will provide an overview of the concepts of open science, reproducibility, and the FAIR principles of research data, as well as explore how to deposit and share data in Borealis, the Canadian Dataverse Repository, a bilingual, multidisciplinary, secure, Canadian research data repository, supported by academic libraries and research institutions across Canada. The learning objectives of the workshop include: understand the Canadian context of sharing data as it relates to the FAIR principles and the importance of scientific reproducibility; gain skills related to depositing and sharing research data, documentation, and code in Borealis; and explore Borealis features to support reproducibility and effective reuse of research data, including Computational Workflow Metadata and uploading from GitHub. Participants will have the opportunity to search and access sample datasets and code, with a focus on real world examples and use cases. By the end of the workshop, participants will have gained skills and knowledge related to depositing and sharing research data, documentation, and code with an emphasis on openness and reproducibility, improving the quality and impact of their research. Level: Introductory Length: 1.5 Hours Format: Lecture Prerequisites: None
S1	:: Fri. June 7 :: 09:00 to 12:00 13:30 to 16:30	Introduction to C :: Register :: Link to Course :: n/a :: This course will provide hands-on experience with fundamental concepts of programming using C. These will include conditional statements, loops (while and for), arrays, pointers, functions, and dynamic memory allocation. An introduction will be provided to fundamental data structures such as linked lists, stacks, queues and binary trees. Level: Introductory Length: Two 3-Hour Sessions Format: Lecture + Hands-on Prerequisites: None
S2	:: Fri. June 7 :: 09:00 to 10:20	Academic Libraries and Machine Learning: Transforming the Library :: Register :: Link to Course :: n/a :: The application of machine learning (ML) to academic libraries promises to be transformational. A Task Force of the Ontario Council of University Libraries (OCUL) has been exploring this technology and identifying specific ML use cases. OCUL is an association of the 21 university libraries in Ontario who collaborate on many shared services and resources. This session will review the work of the Task Force with a focus on use cases, and the requirements and processes to implement pilot programs and production services. Particular attention will be placed on the technology infrastructure (compute, software) and the expertise requirements (technology, domain). Use cases to be discussed include audio to text transcription, metadata creation, virtual reference (chat), and discovery using natural language processing (NLP), semantic search, and summarization. The discovery use case will be applied to some of the extensive data collections maintained by Scholar Portal, the shared resource managed by OCUL, including over 65 million articles from over 27,000 full text scholarly journals and a collection of over 800K digital books and government documents. Participants will be encouraged to engage with key questions about the adoption and use of machine learning in libraries and to provide feedback on the ongoing evolution of this technology as it benefits library applications. Level: Introductory Length: 1.5 Hours Format: Lecture Prerequisites: None
S2	:: Fri. June 7 :: 13:30 to 14:50	Working with Jupyter on the Clusters :: Register :: Link to Course :: n/a :: Jupyter Notebook is commonly used for interactive computing in Python. This session presents the options and features for working with Jupyter on the Digital Research Alliance of Canada's remote computing clusters and demonstrates several use case examples on the clusters. Level: Introductory Length: 1.5 Hours Format: Lecture + Demonstration Prerequisites: Basic Python and Linux command line experience.
S2	:: Fri. June 7 :: 15:00 to 16:20	Using Odesi for Survey and Public Opinion Research :: Register :: Link to Course :: n/a :: Odesi (https://odesi.ca) is a Canadian social science data repository and online data exploration and analysis tool. Odesi’s collections include over 5,700 historical and contemporary surveys and public opinion polls from a variety of data providers such as Statistics Canada and the Canadian Opinion Research Archive (CORA). This workshop will demonstrate how to effectively search for and access data within Odesi on a variety of social, economic, and political topics. Attendees will learn how to navigate the interface, using search features and available collections, explore survey questions (variables), perform basic tabulations and analysis using connected tools, and download datasets into statistical software (e.g. R, SPSS) for further analysis. Level: Introductory Length: 1.5 Hours Format: Lecture Prerequisites: None
Week 2
S1	:: Mon. June 10 :: 09:00 to 12:00	Text Mining :: Register :: Link to Course :: n/a :: This workshop introduces the topic of text mining and its applications. It covers different encoding mechanisms to convert text into numbers that algorithms can handle. It gives an overview of different text mining tasks, including de-identification, sentiment analysis and document clustering, and how they work with examples and live demos. There will also be references to state-of-the-art tools and libraries to conduct various text mining tasks. Level: Introductory Length: 3 Hours Format: Lecture + Hands-on Prerequisites: Basic Python
S2	:: Mon. June 10 :: 09:00 to 12:00 :: Wed. June 12 :: 13:30 to 16:30	Introduction to Scalable and Accelerated Data Analytics :: Register :: Link to Course :: n/a :: Some popular Python libraries for data analytics, like Numpy, Pandas, Scikit-Learn, etc., usually work well if the dataset fits into the RAM on a single machine. When dealing with large datasets, it can be a challenge to work around memory constraints. This course introduces scalable and accelerated data analytics with Dask and RAPIDS. Dask provides a framework and libraries that can handle large datasets on a single multi-core machine or across multiple machines on a cluster. RAPIDS, on the other hand, can accelerate your data analytics by offloading analytics workloads to GPUs with little effort in code changes. Level: Introductory Length: Two 3-Hour Sessions (2 Days) Format: Lecture + Hands-on Prerequisites: Alliance Account, Basic Python and Linux command line experience.
S1	:: Mon. June 10 :: 15:00 to 16:00 :: Wed. June 12 :: 15:00 to 16:30 :: Fri. June 14 :: 15:00 to 16:30	Leveraging HPC for Computational Fluid Dynamics :: Register :: Link to Course :: n/a :: Computational Fluid Dynamics (CFD) is a field of computational physics that has a very high utilization of modern Advanced Research Computing (ARC) resources. The spatial and temporal resolution required to solve modern CFD problems means that it is well suited to take advantage of the full benefits of large-scale distributed memory parallelization that is available on high-performance computing (HPC) systems. As CFD tools have progressed over the past years, their robustness, predictive capabilities, and user-friendliness have drastically improved, which means that these tools are increasingly being adopted by non-traditional HPC users such as new graduate students, experimentalists, theoreticians, and even student design teams. This course is intended to help learners with a basic understanding of fluid dynamics and CFD bridge the knowledge gap towards the effective utilization of CFD on modern HPC architectures. This course will take an end-user approach to CFD tools on HPC systems (no coding) and, despite some prerequisites, will be given at an introductory/intermediate level (we will not cover advanced topics such as GPU or dynamic load-balancing). At the end of the course, the learner will be able to: Develop a systematic approach to estimate the HPC cost of a CFD problem. Explain the impact of modelling assumptions on HPC cost. Optimize the parameters and simulations for effective HPC usage. The course will use an entirely open source suite of CFD toolsets to mesh (Gmsh), simulate (OpenFOAM/SU2), and visualize (VisIt/ParaView). It should be noted that this is not a CFD course; therefore, undergraduate-level knowledge of CFD and numerical methods is expected, as well as a basic understanding of the Compute Ontario HPC system. The focus is on the effective use of CFD tools in modern HPC systems. Course structure: The course will be delivered in a hybrid format with three 1.5-hour blocks covering the following topics: Planning and estimating HPC requirement (1.5 hours). Optimizing CFD simulations for HPC (1.5 hours). Running, visualizing, and organizing data (1.5 hours). The synchronous segments will be used to explain the important theories and concepts and introduce the tutorial problem that the learners will be asked to work on individually between the synchronous sessions. The course is based on the newly developed ARC4CFD (https://arc4cfd.github.io) course. Level: Intermediate Length: Three 1.5-Hour Sessions (3 Days) Format: Lecture + Hands-on Prerequisites: Undergraduate-level knowledge of fluid dynamics (ideally with some knowledge of turbulence), CFD, and numerical methods.
S2	:: Mon. June 10 :: 13:30 to 16:30	Reproducible Research: Practices and Tools :: Register :: Link to Course :: n/a :: Have you ever tried to run someone else’s code and it just didn’t work? Have you ever been lost interpreting your colleague’s data? This hands-on session will provide researchers with tools and techniques to make their research process more transparent and reusable in remote computing environments. You’ll be using platforms like JupyterHub and command-line tools like Bash and Docker in a Linux environment to interact with the material through various exercises and examples. In this workshop, you’ll learn about: organizing your file directories; writing readable metadata with README files; automating your workflow with scripts; and capturing and sharing your computational environment. Level: Introductory Length: 3 hours Format: Lecture + Hands-on Prerequisites: Initial familiarity with command line tools and/or a Linux environment may be beneficial but not mandatory
S1	:: Tues. June 11 :: 09:00 to 12:00 13:30 to 16:30	Multicore Parallel Programming (OpenMP) :: Register :: Link to Course :: n/a :: This is an introductory to intermediate level hands-on OpenMP course. OpenMP is a standard parallel programming API that supports multi-platform shared-memory multiprocessing programming in C, C++, and Fortran. This one-day course will cover the principles of OpenMP compiler directives, library routines, and environment variables with step-by-step hands-on examples. Case studies include various approaches for loop parallelism. We will also talk about the Task constructs for irregular programs, and the Target constructs for accelerators such as GPUs. Participants will gain hands-on programming experience with OpenMP and learn how to compile and run multithreaded OpenMP code on different Alliance clusters. Level: Introductory Length: Two 3-Hour Sessions Format: Lecture + Hands-on (Hands-on portion is CPU only.) Prerequisites: Basic knowledge of C, C++, or Fortran
S2	:: Tues. June 11 :: 09:00 to 12:00 13:30 to 16:30	Machine Learning :: Register :: Link to Course :: n/a :: This course provides an introduction to machine learning that enables computers to learn AI models from data without being explicitly programmed. It comprises two parts: Part I covers the fundamentals of machine learning, and Part II demonstrates the applications of various machine learning methods in solving a real world problem. Rather than presenting the key concepts and components of machine learning in an abstract way, this course introduces them with a small number of examples. Plotting and animations give insight into some of the mechanics of machine learning. Furthermore, the student will gain practical skills in a case study, in which each step of developing a machine learning project is presented. By the end of this course, the student will have a solid understanding of and experience with some of the fundamentals of machine learning, enabling subsequent exploration. Level: Introductory to Intermediate Length: Two 3-Hour Sessions Format: Lecture + Hands-on Prerequisites: Data preparation or equivalent knowledge. Basic Python knowledge and experience. Knowledge and experience with Tensorflow and Scikit-learn would also be helpful.
S1	:: Wed. June 12 :: 09:00 to 12:00	Parallel Computing with MATLAB :: Register :: Link to Course :: n/a :: During this hands-on workshop, we will introduce parallel and distributed computing in MATLAB with a focus on speeding up application codes and offloading compute. By working through common scenarios and workflows using hands-on demos, you will gain a detailed understanding of the parallel constructs in MATLAB, their capabilities, and some of the common hurdles that you'll encounter when using them. Users will learn: Multithreading vs multiprocessing; When to use parfor vs parfeval constructs; Creating data queues for data transfer; Leveraging NVIDIA GPUs; Parallelizing Simulink models; Working with large data. Level: Intermediate Length: 3 Hours Format: Lecture + Hands-on Prerequisites: Working knowledge of MATLAB
S2	:: Wed. June 12 :: 09:00 to 12:00 :: Thurs. June 13 :: 09:00 to 12:00 :: Fri. June 14 :: 09:00 to 12:00 13:30 to 16:30	Artificial Neural Networks (Deep Learning) :: Register :: Link to Course :: n/a :: NOTE: This course is divided into four (4) parts over three (3) days. Part I and Part II Description: Introduction of neural network programming concepts, theory, and techniques. The class material will begin at an introductory level, intended for those with no experience with neural networks, eventually covering intermediate concepts. (The Keras neural network framework will be used for neural network programming but no experience with Keras will be expected.) Part III Description: This part will continue the development of neural network programming approaches from Parts I and II. This part will focus on generative methods used to create images: variational auto-encoders, generative adversarial networks, and diffusion networks. Part IV Description: This part will continue the development of neural network programming approaches from Parts I through III. This part will focus on methods used to generate sequences: LSTM networks, sequence-to-sequence networks, and transformers. Level: Intermediate Length: Four 3-Hour Sessions (3 Days) Format: Lecture + Hands-on Prerequisites: Experience with Python (version 3.10) is assumed. Each part assumes what was covered in the previous parts of this course. Parts III and IV assume experience with neural network programming, per the first two neural network programming sessions in this course.
S1	:: Thurs. June 13 :: 09:00 to 12:00 13:30 to 16:30	Using Containers: Apptainer :: Register :: Link to Course :: n/a :: Apptainer is a secure container technology designed to be used on high performance compute clusters. This workshop will focus on how to use Apptainer as well as how to make use of tools such as Conda and Spack within Apptainer. By the end of these sessions, one will have learnt: about Apptainer and how it is installed and used on our computer clusters, how to build an Apptainer container image, how to install tools such as Conda/Spack from inside an Apptainer container shell, and how to use Apptainer containers within job submission scripts. Level: Introductory Length: Two 3-Hour Sessions Format: Lecture + Hands-on Prerequisites: Experience using Alliance compute clusters, e.g., using the BASH shell and submitting jobs.
S2	:: Thurs. June 13 :: 13:30 to 16:30	oneAPI Library and Programming Model for Image Inferencing for CPU and GPU :: Register :: Link to Course :: n/a :: oneAPI is a unified application programming interface intended to be used across different compute accelerator architectures, including CPUs, GPUs and AI accelerators. Its aim is to unify the programming model and simplify cross-architecture development. It also provides libraries for: deep neural network (DNN) learning applications, collective communications for machine learning and deep learning projects, and data analytics, making big data analysis faster using optimized algorithms. By the end of this workshop one will have: learned about oneAPI libraries and the inference toolkit, components, and capabilities for developing and deploying computer vision and deep learning solutions, explored techniques for optimizing pre-trained deep learning models and learned how to work with models from different frameworks like Tensorflow, PyTorch and Caffe, understood how to perform inference on different hardware such as CPU and GPU, and considered practical computer vision applications and use cases. Level: Intermediate Length: 3 Hours Format: Lecture + Hands-on Prerequisites: Attendees having hands-on experience with Python and some experience with Tensorflow or PyTorch will get the most out of this workshop.
S1	:: Fri. June 14 :: 09:00 to 12:00	Machine Learning with MATLAB :: Register :: Link to Course :: n/a :: Machine learning is a data analytics technique that teaches computers to do what comes naturally to humans and animals: learn from experience. Machine learning algorithms use computational methods to “learn” information directly from data without relying on a predetermined equation as a model. In this hands-on introductory workshop, you will learn how to apply Machine Learning, and get familiar with the basics of Deep Learning. MATLAB provides an environment to apply advanced techniques without requiring extensive coding or experience in machine learning. Learn the fundamentals of machine learning (supervised learning, feature extraction, and hyperparameter tuning). Explore pre-processing and powerful visualization techniques. Build and evaluate machine learning models for classification and regression of various data formats (signals, images, text). Perform hyperparameter tuning and feature selection to optimize model performance. Discuss interoperability with other platforms. Learn how to deploy Machine Learning models. Level: Intermediate Length: 3 Hours Format: Lecture + Hands-on Prerequisites: Working knowledge of MATLAB
Week 3
Stream	When (EDT)	Course
S1	:: Mon. June 17 :: 09:00 to 12:00 13:30 to 16:30 :: Tues. June 18 :: 09:00 to 12:00 13:30 to 16:30 :: Wed. June 19 :: 09:00 to 12:00 13:30 to 16:30	GPU Programming :: Register :: Link to Course :: n/a :: This is an introductory course covering programming and computing on GPUs (graphics processing units), which are an increasingly common presence in massively parallel computing architectures. The basics of GPU programming will be covered, and students will work through a number of hands-on examples. How to structure data and computations to make full use of the GPU will be discussed in detail. Students should leave the course with the knowledge necessary to begin developing their own GPU applications. Level: Introductory Length: Six 3-Hour Sessions (3 Days) Format: Lecture + Hands-on Prerequisites: Alliance Account; Basic C and/or C++ experience
S2	:: Mon. June 17 :: 09:00 to 12:00 13:30 to 16:30	High Performance Computing in Python :: Register :: Link to Course :: n/a :: Learn how to improve performance and use parallel programming in Python. We will cover profiling, subprocess, numexpr, multiprocessing, MPI, and other performance-enhancing techniques. Level: Intermediate Length: Two 3-Hour Sessions Format: Lecture + Hands-on Prerequisites: Some Python and Linux command line experience.
S2	:: Tues. June 18 :: 09:00 to 12:00 13:30 to 16:30 :: Thurs. June 20 :: 09:00 to 12:00 13:30 to 16:30	Modern C++ Parallel Programming :: Register :: Link to Course :: n/a :: Modern C++ is an efficient, versatile programming language. This workshop will focus on the following in both sequential and parallel contexts: using pseudo-random number generators, making use of reduction options using underlying sequential code, making simple use of in-situ code benchmarking/profiling, and, using mdspan for accessing multi-dimensional arrays and multi-dimensional array slices (submdspan). By the end of these sessions, one will have learnt about sequential and parallel uses of: C++ pseudo-random number generators and their use, C++ std::reduce(), std::transform_reduce(), etc. and C++ parallel algorithms and some of their uses and caveats, using std::chrono facilities, e.g., for in-situ benchmarks, and, how to use multi-dimensional arrays and slices in C++ code. Level: Intermediate Length: Four 3-Hour Sessions (2 Days) Format: Lecture + Hands-on Prerequisites: Experience developing sequential code in C++. (The C++ programming language is not the C programming language. Experience is expected programming in C++, e.g., using the standard library's containers, iterators, and algorithms.)
S2	:: Wed. June 19 :: 09:00 to 12:00 13:30 to 16:30	Scientific Visualization :: Register :: Link to Course :: n/a :: During this workshop, we will learn about matplotlib which is a popular Python library that is great for 2D visualizations, and ParaView, a free and open-source visualization tool for creating 3D visualizations of your datasets. In this interactive workshop you will get familiar with how ParaView works and at the end you should be able to generate basic visualizations of the demo data. Level: Introductory Length: Two 3-Hour Sessions Format: Lecture + Hands-on Prerequisites: None
S1	:: Thurs. June 20 :: 09:00 to 12:00 13:30 to 16:30 :: Fri. June 21 :: 09:00 to 12:00 13:30 to 16:30	SQL :: Register :: Link to Course :: n/a :: In our digitally-driven world, databases are the cornerstone of virtually every online service and application. They help store your favourite songs on music platforms, track orders on shopping sites, and keep your personal information safe and sound. These incredible systems are the backbone of our digital universe, silently and efficiently managing the vast oceans of data that flow through our daily lives. From the social media sites we share with our friends to the online transactions that make our lives easier, databases are the unsung heroes, diligently organizing, storing, and retrieving information with remarkable precision. Whether you're a technical professional or just beginning to explore data management, the journey into the realm of databases is both enlightening and rewarding, offering endless opportunities for discovery and innovation. Together, we will explore the secrets that make our connected world tick. Level: Introductory Length: Four 3-Hour Sessions (2 Days) Format: Lecture + Hands-on Prerequisites: Basic programming knowledge; Installation of MySQL on one's personal computer
S2	:: Fri. June 21 :: 09:00 to 12:00 13:30 to 16:30	Bioinformatics: Introduction and Metagenomics :: Register :: Link to Course :: n/a :: Bioinformatics, the interdisciplinary field at the intersection of biology and computational science, has revolutionized our understanding of life processes. In this one-day course, we will first tune your HPC knowledge/skills towards bioinformatics computing. Then a typical metagenomics pipeline will be explored to introduce common tools used in bioinformatic analysis and to show how they can be run in an HPC environment. Join us for an immersive day of hands-on exploration in the captivating world of bioinformatics and metagenomics! Level: Introductory Length: Two 3-Hour Sessions Format: Lecture + Hands-on Prerequisites: Alliance Account; Basic understanding of biology and familiarity with Unix shells (e.g. bash, zsh, etc.).
